[fix](fe) Improve MaxCompute catalog validation#64119
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 28829 ms |
TPC-DS: Total hot run time: 169519 ms |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29310 ms |
TPC-DS: Total hot run time: 169725 ms |
|
/review |
There was a problem hiding this comment.
I found one blocking issue in the new MaxCompute namespace-schema validation path. The PR otherwise stays focused on MaxCompute catalog validation and unsupported table/view rejection, with targeted unit tests added, but this validation path can reject valid existing deployments that use private/intranet MaxCompute endpoints.
Critical checkpoint conclusions:
- Goal/test: The goal appears to be earlier MaxCompute validation and explicit rejection of unsupported external tables/logical views. The new tests cover local branching and unsupported table/view checks, but not private/intranet endpoint behavior for the new OpenAPI validation.
- Scope/focus: The code change is mostly focused, though it adds a new SDK dependency and a second MaxCompute client path.
- Concurrency/lifecycle: No new shared mutable concurrency path found. Catalog initialization remains synchronized by ExternalCatalog, but the new remote validation runs during lazy initialization.
- Configuration/compatibility: The new OpenAPI validation does not honor the existing mc.endpoint/private endpoint configuration, which is a compatibility regression for namespace-schema catalogs.
- Parallel paths: Read and write unsupported-table checks were both updated.
- Error handling/observability: Errors are surfaced during catalog initialization with useful context, but the incorrect endpoint selection makes the error misleading for private endpoint users.
- Data correctness/transactions/persistence: No transaction visibility or persisted metadata format issue found in the reviewed diff.
- Performance: No hot-path performance issue found; validation runs at initialization.
- Security model/focus: I read SECURITY.md and threat-model.md because this PR touches external catalog connection/auth material. The finding is an operational correctness/compatibility issue, not a security vulnerability under the model. No additional user-provided review focus was present.
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
afb3991 to
d1e407c
Compare
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 28831 ms |
TPC-DS: Total hot run time: 169441 ms |
|
run buildall |
TPC-H: Total hot run time: 29511 ms |
TPC-DS: Total hot run time: 171043 ms |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
|
/review |
FE Regression Coverage ReportIncrement line coverage |
There was a problem hiding this comment.
Review summary: no additional blocking issues found in the current head.
Critical checkpoint conclusions:
- Goal and tests: The PR implements
test_connection-gated MaxCompute validation for project access and namespace-schema access, plus read/write rejection for unsupported MaxCompute external tables/views. Added FE unit tests and MaxCompute p2 regression coverage exercise the intended behavior. - Scope/focus: The implementation is focused on MaxCompute catalog validation and unsupported-table safeguards. The split byte-size error-message fix is small and related to property validation quality.
- Concurrency/lifecycle: No new shared mutable state or lock ordering concern found.
checkWhenCreating()runs before catalog registration and outside the catalog manager write lock; the initialized local objects are reused after registration through the existingobjectCreatedlifecycle. - Configuration: No new config item was added. Existing
test_connectionsemantics are used with the default remaining false. - Compatibility/persistence: No storage format, edit log format, or FE-BE protocol incompatibility found. Persisted catalog properties remain unchanged except when users explicitly set
test_connection. - Parallel paths: Read and write paths both now reject unsupported MaxCompute external tables/views before Storage API session work. Namespace and non-namespace validation paths are both covered.
- Error handling/observability: Validation failures are converted to
DdlExceptionwith project/endpoint/credential context. No ignoredStatus-style issue applies in this FE Java path. - Test coverage: Unit tests cover default skip, explicit validation, namespace validation, validation failures, split-size message, and unsupported table/view read/write rejection. Regression coverage was added for
test_connectiontrue/false catalog creation cases. - Performance: Validation only runs when
test_connection=true; no hot-path regression found.
Existing review context: the prior inline thread about namespace validation ignoring the configured endpoint is already known. In the current head, namespace-schema validation uses the configured ODPS client (odps.schemas().iterator(projectName)) instead of a region-only OpenAPI client, so I did not raise a duplicate.
User focus: no additional user-provided focus points were present.
Tests not run by reviewer in this Actions review pass.
What problem does this PR solve?
Problem Summary:
Use the
test_connectioncatalog parameter, to perform ak-sk、 project 、schema validation when creating a maxcompute catalog.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)